Few-Shot Composition Learning for Image Retrieval with Prompt Tuning
Abstract
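We study the problem of composition learning for image retrieval, for which we learn to retrieve target images with search queries in the form of a composition of a reference image and a modification text that describes desired modifications of the image. Existing models of composition learning for image retrieval are generally built with large-scale datasets, demanding extensive training samples, i.e., query-target pairs, as supervision, which restricts their application to the few-shot scenario in which only a few query-target pairs are available. Recently, prompt tuning with frozen pretrained language models has shown remarkable performance when the amount of training data is limited. Inspired by this, we propose a prompt tuning mechanism with the pretrained CLIP model for the task of few-shot composition learning for image retrieval. Specifically, we regard the representation of the reference image as a trainable visual prompt, prefixed to the embedding of the text sequence. One challenge is to efficiently train the visual prompt with few-shot samples. To deal with this issue, we further propose a self-supervised auxiliary task via ensuring that the reference image can retrieve itself when no modification information is given from the text, which facilitates training of the visual prompt while not requiring additional annotations for query-target pairs. Experiments on multiple benchmarks show that our proposed model can yield superior performance when trained with only a few query-target pairs.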
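The two ideas in the abstract, a visual prompt prefixed to the text embedding sequence and a self-retrieval auxiliary task, can be illustrated with a small NumPy sketch. Everything here is an assumption made for illustration: the dimensions, the random frozen matrix `W_text` standing in for a pretrained text encoder, mean pooling in place of a transformer, the projection `W_prompt`, and the contrastive form of the loss are hypothetical and are not taken from the paper or from the CLIP implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IMG, D_TOK = 8, 6  # hypothetical image-feature and token-embedding sizes

# Frozen stand-in for a pretrained text encoder head (NOT real CLIP weights).
W_text = rng.normal(size=(D_TOK, D_IMG))
# The only trainable part in this sketch: maps an image feature to one prompt token.
W_prompt = rng.normal(size=(D_IMG, D_TOK)) * 0.1


def l2_normalize(x, axis=-1):
    return x / np.linalg.norm(x, axis=axis, keepdims=True)


def compose_query(ref_img_feat, text_tokens, W_prompt):
    """Prefix the visual prompt to the text token embeddings, then encode."""
    visual_prompt = ref_img_feat @ W_prompt                 # shape (D_TOK,)
    sequence = np.vstack([visual_prompt[None, :], text_tokens])  # (1 + T, D_TOK)
    pooled = sequence.mean(axis=0)       # toy pooling instead of a transformer
    return l2_normalize(pooled @ W_text)                    # shape (D_IMG,)


def retrieve(query, gallery_feats):
    """Rank gallery images by cosine similarity to the composed query."""
    scores = l2_normalize(gallery_feats) @ query
    return np.argsort(-scores)


def self_retrieval_loss(ref_feats, W_prompt, temperature=0.1):
    """Auxiliary objective: with no modification text, each reference image
    should retrieve itself (assumed here to be a softmax cross-entropy)."""
    empty_text = np.zeros((0, D_TOK))
    queries = np.stack([compose_query(f, empty_text, W_prompt) for f in ref_feats])
    logits = queries @ l2_normalize(ref_feats).T / temperature   # (N, N)
    logits = logits - logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))


gallery = rng.normal(size=(5, D_IMG))        # toy image features
text_tokens = rng.normal(size=(3, D_TOK))    # toy modification-text embeddings
query = compose_query(gallery[0], text_tokens, W_prompt)
ranking = retrieve(query, gallery)
loss = self_retrieval_loss(gallery, W_prompt)
```

In a real few-shot setting one would presumably minimize this auxiliary loss together with the supervised retrieval loss on the few query-target pairs, updating only `W_prompt` while the pretrained encoders stay frozen; the auxiliary term needs no extra annotations because each reference image serves as its own target.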
Similar resources
Few-Shot Learning Through an Information Retrieval Lens
Few-shot learning refers to understanding new concepts from only a few examples. We propose an information retrieval-inspired approach for this problem that is motivated by the increased importance of maximally leveraging all the available information in this low-data regime. We define a training objective that aims to extract as much information as possible from each training batch by effectiv...
Few-shot Learning
Though deep neural networks have shown great success in the large data domain, they generally perform poorly on few-shot learning tasks, where a classifier has to quickly generalize after seeing very few examples from each class. The general belief is that gradient-based optimization in high capacity classifiers requires many iterative steps over many examples to perform well. Here, we propose ...
Few-Shot Learning with Graph Neural Networks
We propose to study the problem of few-shot learning with the prism of inference on a partially observed graphical model, constructed from a collection of input images whose label can be either observed or not. By assimilating generic message-passing inference algorithms with their neural-network counterparts, we define a graph neural network architecture that generalizes several of the recentl...
Few-Shot Learning with Meta Metric Learners
Existing few-shot learning approaches are based on either meta-learning or metric learning, which would suffer if the tasks have varying numbers of classes and/or the tasks diverge significantly. We propose meta metric learning to deal with the limitations of the existing few-shot learning approaches. Our meta metric learning approach consists of two components, task-specific learners that explo...
Image-Mediated Learning for Zero-Shot Cross-Lingual Document Retrieval
We propose an image-mediated learning approach for cross-lingual document retrieval where no or only a few parallel corpora are available. Using the images in image-text documents of each language as the hub, we derive a common semantic subspace bridging two languages by means of generalized canonical correlation analysis. For the purpose of evaluation, we create and release a new document data...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i4.25597